RAW SEQUENCE LISTING DATE: 08/12/2004
PATENT APPLICATION: US/09/774,954 TIME: 18:36:35

Input Set : N:\Crf3\RULE60\09774954.raw
Output Set: N:\CRF4\08122004\I774954.raw

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Yang Wang, Michael W. Spellman 
(ii) TITLE OF INVENTION: O-Fucosyl transferase 
(iii) NUMBER OF SEQUENCES: 17 
(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Genentech, Inc.

(B) STREET: 1 DNA Way 

(C) CITY: South San Francisco 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP: 94080 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: 3.5 inch, 1.44 Mb floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: WinPatin (Genentech) 
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US/09/774,954

(B) FILING DATE: 30-Jan-2001 

(C) CLASSIFICATION: 
(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US/08/978,741 

(B) FILING DATE: 26-NOV-1997 

(A) APPLICATION NUMBER: 08/792,498 

(B) FILING DATE: 31-JAN-1997 
(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Svoboda, Craig G.

(B) REGISTRATION NUMBER: 39,044

(C) REFERENCE/DOCKET NUMBER: P1041P1
(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 650/225-1489

(B) TELEFAX: 650/952-9881
(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1514 base pairs 

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATGCCCGCGG GCTCCTGGGA CCCGGCCGGT TACCTGCTCT ACTGCCCCTG 50
CATGGGGCGC TTTGGGAACC AGGCCGATCA CTTCTTGGGC TCTCTGGCAT 100
TTGCAAAGCT GCTAAACCGT ACCTTGGCTG TCCCTCCTTG GATTGAGTAC 150
62 CAGCATCACA AGCCTCCTTT CACCAACCTC CATGTGTCCT ACCAGAAGTA 200
64 CTTCAAGCTG GAGGCCCTCC AGGCTTACCA TCGGGTCATC AGCTTGGAGG 250
66 ATTTCATGGA GAAGCTGGCA CCCACCCACT GGCCCCCTGA GAAGCGGGTG 300
68 GCATACTGCT TTGAGGTGGC AGCCCAGCGA AGCCCAGATA AGAAGACGTG 350
70 CCCCATGAAG GAAGGAAACC CCTTTGGCCC ATTCTGGGAT CAGTTTCATG 400 
72 TGAGTTTCAA CAAGTCGGAG CTTTTTACAG GCATTTCCTT CAGTGCTTCC 450 
74 TACAGAGAAC AATGGAGCCA GAGATTTTCT CCAAAGGAAC ATCCGGTGCT 500
76 TGCCCTGCCA GGAGCCCCAG CCCAGTTCCC CGTCCTAGAA GAACACAGGC 550
78 CACTACAGAA GTACATGGTA TGGTCAGACG AAATGGTGAA GACGGGAGAG 600
80 GCCCAGATTC ATGCCCACCT TGTCCGGCCC TATGTGGGCA TTCATCTGCG 650 
82 CATTGGCTCT GACTGGAAGA ACGCCTGTGC CATGCTGAAG GACGGGACTG 700 
84 CAGGCTCGCA CTTCATGGCC TCTCCGCAGT GTGTGGGCTA CAGCCGCAGC 750 
86 ACAGCGGCCC CCCTCACGAT GACTATGTGC CTGCCTGACC TGAAGGAGAT 800 
88 CCAGAGGGCT GTGAAGCTCT GGGTGAGGTC GCTGGATGCC CAGTCGGTCT 850 
90 ACGTTGCTAC TGATTCCGAG AGTTATGTGC CTGAGCTCCA ACAGCTCTTC 900 
92 AAAGGGAAGG TGAAGGTGGT GAGCCTGAAG CCTGAGGTGG CCCAGGTCGA 950 
94 CCTGTACATC CTCGGCCAAG CCGACCACTT TATTGGCAAC TGTGTCTCCT 1000
96 CCTTCACTGC CTTTGTGAAG CGGGAGCGGG ACCTCCAGGG GAGGCCGTCT 1050
98 TCTTTCTTCG GCATGGACAG GCCCCCTAAG CTGCGGGACG AGTTCTGATT 1100 
100 CTGGCCGGAG CACCAGACCC TCTGATCCTG GAGGGACCAG AGTCTGAGCT 1150 
102 GGTCCTTCCA GCCAGGCCTG GCAGCCAGAG GTGCTCCGGG ATTGCAAACT 1200
104 CCTCTTCTCA CCTGCCAAAG ATGGAGAAGA GTGCCAGGGA CCCCTCAAGG 1250 
106 AGGGAGACGC TCCATATCCC AGGGCATAGG ACTTGCAGGT TCCTAGGAGC 1300
108 AGGAGCATCT CCCATCGCAC GTGCTTTCTG CTCTTCTGGG AATTTCTCAC 1350 
110 ACTGGCAAAG CAGTCCAGCC TCCGTCTTCT GGTCCACTCT GCTCTGAGCA 1400 
112 GCCTGGGATG CTGAACTCTT CAGAGAGATT TTTTTATAGA GAGATTTCTA 1450 
114 TAATTTTGAT ACAAGGTCAT GACTATCCTA GAACTCTCTG TGGTTTTTGA 1500
116 AAATCATTGA ATTC 1514 

118 (2) INFORMATION FOR SEQ ID NO: 2: 

120 (i) SEQUENCE CHARACTERISTICS:

121 (A) LENGTH: 365 amino acids 

122 (B) TYPE: Amino Acid 

123 (D) TOPOLOGY: Linear 

125 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

127 Met Pro Ala Gly Ser Trp Asp Pro Ala Gly Tyr Leu Leu Tyr Cys
128 1 5 10 15

130 Pro Cys Met Gly Arg Phe Gly Asn Gln Ala Asp His Phe Leu Gly
131 20 25 30

133 Ser Leu Ala Phe Ala Lys Leu Leu Asn Arg Thr Leu Ala Val Pro
134 35 40 45

136 Pro Trp Ile Glu Tyr Gln His His Lys Pro Pro Phe Thr Asn Leu
137 50 55 60

139 His Val Ser Tyr Gln Lys Tyr Phe Lys Leu Glu Pro Leu Gln Ala
140 65 70 75

142 Tyr His Arg Val Ile Ser Leu Glu Asp Phe Met Glu Lys Leu Ala
143 80 85 90

145 Pro Thr His Trp Pro Pro Glu Lys Arg Val Ala Tyr Cys Phe Glu
146 95 100 105

148 Val Ala Ala Gln Arg Ser Pro Asp Lys Lys Thr Cys Pro Met Lys
149 110 115 120

151 Glu Gly Asn Pro Phe Gly Pro Phe Trp Asp Gln Phe His Val Ser
152 125 130 135

154 Phe Asn Lys Ser Glu Leu Phe Thr Gly Ile Ser Phe Ser Ala Ser
155 140 145 150

157 Tyr Arg Glu Gln Trp Ser Gln Arg Phe Ser Pro Lys Glu His Pro
158 155 160 165

160 Val Leu Ala Leu Pro Gly Ala Pro Ala Gln Phe Pro Val Leu Glu
161 170 175 180

163 Glu His Arg Pro Leu Gln Lys Tyr Met Val Trp Ser Asp Glu Met
164 185 190 195

166 Val Lys Thr Gly Glu Ala Gln Ile His Ala His Leu Val Arg Pro
167 200 205 210

169 Tyr Val Gly Ile His Leu Arg Ile Gly Ser Asp Trp Lys Asn Ala
170 215 220 225

172 Cys Ala Met Leu Lys Asp Gly Thr Ala Gly Ser His Phe Met Ala
173 230 235 240

175 Ser Pro Gln Cys Val Gly Tyr Ser Arg Ser Thr Ala Ala Pro Leu
176 245 250 255

178 Thr Met Thr Met Cys Leu Pro Asp Leu Lys Glu Ile Gln Arg Ala
179 260 265 270

181 Val Lys Leu Trp Val Arg Ser Leu Asp Ala Gln Ser Val Tyr Val
182 275 280 285

184 Ala Thr Asp Ser Glu Ser Tyr Val Pro Glu Leu Gln Gln Leu Phe
185 290 295 300

187 Lys Gly Lys Val Lys Val Val Ser Leu Lys Pro Glu Val Ala Gln
188 305 310 315

190 Val Asp Leu Tyr Ile Leu Gly Gln Ala Asp His Phe Ile Gly Asn
191 320 325 330

193 Cys Val Ser Ser Phe Thr Ala Phe Val Lys Arg Glu Arg Asp Leu
194 335 340 345

196 Gln Gly Arg Pro Ser Ser Phe Phe Gly Met Asp Arg Pro Pro Lys
197 350 355 360

199 Leu Arg Asp Glu Phe
200 365

202 (2) INFORMATION FOR SEQ ID NO: 3: 

204 (i) SEQUENCE CHARACTERISTICS: 

205 (A) LENGTH: 61 amino acids 

206 (B) TYPE: Amino Acid 

207 (D) TOPOLOGY: Linear

209 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

211 Arg Leu Ala Gly Ser Trp Asp Leu Ala Gly Tyr Leu Leu Tyr Xaa 

212 1 5 10 15 

214 Pro Xaa Met Gly Arg Phe Gly Asn Gln Ala Asp His Phe Leu Gly

215 20 25 30

217 Ser Leu Ala Phe Ala Lys Leu Xaa Val Arg Thr Leu Ala Val Pro

218 35 40 45

220 Pro Trp Ile Glu Tyr Gln His His Lys Pro Pro Phe Thr Asn Leu

221 50 55 60

223 His

224 61

226 (2) INFORMATION FOR SEQ ID NO: 4:

228 (i) SEQUENCE CHARACTERISTICS: 

229 (A) LENGTH: 1300 base pairs 

230 (B) TYPE: Nucleic Acid 

231 (C) STRANDEDNESS: Single

232 (D) TOPOLOGY: Linear 

234 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

237 TTATTCATAC CGTCCCACCA TCGGGCGCGG ATCAGATCCA TGGCCAAGTT 50 
239 CCTGGTCAAC GTGGCCCTGC TGCTGCTGCT GCTGCTGCTG TCCGGAGCCT 100 
241 GGGCCCATAT GAGATCCCAT CACCATCACC ATCACATGCC CGCGGGCTCC 150 
243 TGGGACCCGG CCGGTTACCT GCTCTACTGC CCCTGCATGG GGCGCTTTGG 200 
245 GAACCAGGCC GATCACTTCT TGGGCTCTCT GGCATTTGCA AAGCTGCTAA 250 
247 ACCGTACCTT GGCTGTCCCT CCTTGGATTG AGTACCAGCA TCACAAGCCT 300
249 CCTTTCACCA ACCTCCATGT GTCCTACCAG AAGTACTTCA AGCTGGAGCC 350
251 CCTCCAGGCT TACCATCGGG TCATCAGCTT GGAGGATTTC ATGGAGAAGC 400
253 TGGCACCCAC CCACTGGCCC CCTGAGAAGC GGGTGGCATA CTGCTTTGAG 450 
255 GTGGCAGCCC AGCGAAGCCC AGATAAGAAG ACGTGCCCCA TGAAGGAAGG 500 
257 AAACCCCTTT GGCCCATTCT GGGATCAGTT TCATGTGAGT TTCAACAAGT 550 
259 CGGAGCTTTT TACAGGCATT TCCTTCAGTG CTTCCTACAG AGAACAATGG 600 
261 AGCCAGAGAT TTTCTCCAAA GGAACATCCG GTGCTTGCCC TGCCAGGAGC 650 
263 CCCAGCCCAG TTCCCCGTCC TAGAGGAACA CAGGCCACTA CAGAAGTACA 700 
265 TGGTATGGTC AGACGAAATG GTGAAGACGG GAGAGGCCCA GATTCATGCC 750 
267 CACCTTGTCC GGCCCTATGT GGGCATTCAT CTGCGCATTG GCTCTGACTG 800
269 GAAGAACGCC TGTGCCATGC TGAAGGACGG GACTGCAGGC TCGCACTTCA 850 
271 TGGCCTCTCC GCAGTGTGTG GGCTACAGCC GCAGCACAGC GGCCCCCCTC 900 
273 ACGATGACTA TGTGCCTGCC TGACCTGAAG GAGATCCAGA GGGCTGTGAA 950 
275 GCTCTGGGTG AGGTCGCTGG ATGCCCAGTC GGTCTACGTT GCTACTGATT 1000 
277 CCGAGAGTTA TGTGCCTGAG CTCCAACAGC TCTTCAAAGG GAAGGTGAAG 1050 
279 GTGGTGAGCC TGAAGCCTGA GGTGGCCCAG GTCGACCTGT ACATCCTCGG 1100 
281 CCAAGCCGAC CACTTTATTG GCAACTGTGT CTCCTCCTTC ACTGCCTTTG 1150 
283 TGAAGCGGGA GCGGGACCTC CAGGGGAGGC CGTCTTCTTT CTTCGGCATG 1200 
285 GACAGGCCCC CTAAGCTGCG GGACGAGTTC TGATTCTGGC CGGAGCACCA 1250 
287 GACCCTCTGA TCCTGGAGGG ACCAGAGTCT GAGCTGGTCC TTCCAGCCAG 1300

289 (2) INFORMATION FOR SEQ ID NO: 5:

291 (i) SEQUENCE CHARACTERISTICS:

292 (A) LENGTH: 11284 base pairs 

293 (B) TYPE: Nucleic Acid 

294 (C) STRANDEDNESS: Single 

295 (D) TOPOLOGY: Linear 

297 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

300 AAGCTTTACT CGTAAAGCGA GTTGAAGGAT CATATTTAGT TGCGTTTATG 50
302 AGATAAGATT GAAAGCACGT GTAAAATGTT TCCCGCGCGT TGGCACAACT 100
304 ATTTACAATG CGGCCAAGTT ATAAAAGATT CTAATCTGAT ATGTTTTAAA 150
306 ACACCTTTGC GGCCCGAGTT GTTTGCGTAC GTGACTAGCG AAGAAGATGT 200 
308 GTGGACCGCA GAACAGATAG TAAAACAAAA CCCTAGTATT GGAGCAATAA 250 
310 TCGATTTAAC CAACACGTCT AAATATTATG ATGGTGTGCA TTTTTTGCGG 300 
312 GCGGGCCTGT TATACAAAAA AATTCAAGTA CCTGGCCAGA CTTTGCCGCC 350 
314 TGAAAGCATA GTTCAAGAAT TTATTGACAC GGTAAAAGAA TTTACAGAAA 400
316 AGTGTCCCGG CATGTTGGTG GGCGTGCACT GCACACACGG TATTAATCGC 450 
318 ACCGGTTACA TGGTGTGCAG ATATTTAATG CACACCCTGG GTATTGCGCC 500 
320 GCAGGAAGCC ATAGATAGAT TCGAAAAAGC CAGAGGTCAC AAAATTGAAA 550 
322 GACAAAATTA CGTTCAAGAT TTATTAATTT AATTAATATT ATTTGCATTC 600
324 TTTAACAAAT ACTTTATCCT ATTTTCAAAT TGTTGCGCTT CTTCCAGCGA 650 
326 ACCAAAACTA TGCTTCGCTT GCTCCGTTTA GCTTGTAGCC GATCAGTGGC 700 
328 GTTGTTCCAA TCGACGGTAG GATTAGGCCG GATATTCTCC ACCACAATGT 750 
330 TGGCAACGTT GATGTTACGT TTATGCTTTT GGTTTTCCAC GTACGTCTTT 800 
332 TGGCCGGTAA TAGCCGTAAA CGTAGTGCCG TCGCGCGTCA CGCACAACAC 850
334 CGGATGTTTG CGCTTGTCCG CGGGGTATTG AACCGCGCGA TCCGACAAAT 900
336 CCACCACTTT GGCAACTAAA TCGGTGACCT GCGCGTCTTT TTTCTGCATT 950 
338 ATTTCGTCTT TCTTTTGCAT GGTTTCCTGG AAGCCGGTGT ACATGCGGTT 1000
340 TAGATCAGTC ATGACGCGCG TGACCTGCAA ATCTTTGGCC TCGATCTGCT 1050 
342 TGTCCTTGAT GGCAACGATG CGTTCAATAA ACTCTTGTTT TTTAACAAGT 1100 
344 TCCTCGGTTT TTTGCGCCAC CACCGCTTGC AGCGCGTTTG TGTGCTCGGT 1150
346 GAATGTCGCA ATCAGCTTAG TCACCAACTG TTTGCTCTCC TCCTCCCGTT 1200 
348 GTTTGATCGC GGGATCGTAC TTGCCGGTGC AGAGCACTTG AGGAATTACT 1250 
350 TCTTCTAAAA GCCATTCTTG TAATTCTATG GCGTAAGGCA ATTTGGACTT 1300
352 CATAATCAGC TGAATCACGC CGGATTTAGT AATGAGCACT GTATGCGGCT 1350 
354 GCAAATACAG CGGGTCGCCC CTTTTCACGA CGCTGTTAGA GGTAGGGCCC 1400 
356 CCATTTTGGA TGGTCTGCTC AAATAACGAT TTGTATTTAT TGTCTACATG 1450 
358 AACACGTATA GCTTTATCAC AAACTGTATA TTTTAAACTG TTAGCGACGT 1500
360 CCTTGGCCAC GAACCGGACC TGTTGGTCGC GCTCTAGCAC GTACCGCAGG 1550 
362 TTGAACGTAT CTTCTCCAAA TTTAAATTCT CCAATTTTAA CGCGAGCCAT 1600 
364 TTTGATACAC GTGTGTCGAT TTTGCAACAA CTATTGTTTT TTAACGCAAA 1650 
366 CTAAACTTAT TGTGGTAAGC AATAATTAAA TATGGGGGAA CATGCGCCGC 1700 
368 TACAACACTC GTCGTTATGA ACGCAGACGG CGCCGGTCTC GGCGCAAGCG 1750 
370 GCTAAAACGT GTTGCGCGTT CAACGCGGCA AACATCGCAA AAGCCAATAG 1800 
372 TACAGTTTTG ATTTGCATAT TAACGGCGAT TTTTTAAATT ATCTTATTTA 1850 
374 ATAAATAGTT ATGACGCCTA CAACTCCCCG CCCGCGTTGA CTCGCTGCAC 1900 
376 CTCGAGCAGT TCGTTGACGC CTTCCTCCGT GTGGCCGAAC ACGTCGAGCG 1950 
378 GGTGGTCGAT GACCAGCGGC GTGCCGCACG CGACGCACAA GTATCTGTAC 2000
380 ACCGAATGAT CGTCGGGCGA AGGCACGTCG GCCTCCAAGT GGCAATATTG 2050
382 GCAAATTCGA AAATATATAC AGTTGGGTTG TTTGCGCATA TCTATCGTGG 2100 
384 CGTTGGGCAT GTACGTCCGA ACGTTGATTT GCATGCAAGC CGAAATTAAA 2150 
386 TCATTGCGAT TAGTGCGATT AAAACGTTGT ACATCCTCGC TTTTAATCAT 2200 
388 GCCGTCGATT AAATCGCGCA ATCGAGTCAA GTGATCAAAG TGTGGAATAA 2250 
390 TGTTTTCTTT GTATTCCCGA GTCAAGCGCA GCGCGTATTT TAACAAACTA 2300 
392 GCCATCTTGT AAGTTAGTTT CATTTAATGC AACTTTATCC AATAATATAT 2350
394 TATGTATCGC ACGTCAAGAA TTAACAATGC GCCCGTTGTC GCATCTCAAC 2400
396 ACGACTATGA TAGAGATCAA ATAAAGCGCG AATTAAATAG CTTGCGACGC 2450
398 AACGTGCACG ATCTGTGCAC GCGTTCCGGC ACGAGCTTTG ATTGTAATAA 2500
400 GTTTTTACGA AGCGATGACA TGACCCCCGT AGTGACAACG ATCACGCCCA 2550 
402 AAAGAACTGC CGACTACAAA ATTACCGAGT ATGTCGGTGA CGTTAAAACT 2600
404 ATTAAGCCAT CCAATCGACC GTTAGTCGAA TCAGGACCGC TGGTGCGAGA 2650
406 AGCCGCGAAG TATGGCGAAT GCATCGTATA ACGTGTGGAG TCCGCTCATT 2700 
408 AGAGCGTCAT GTTTAGACAA GAAAGCTACA TATTTAATTG ATCCCGATGA 2750 
410 TTTTATTGAT AAATTGACCC TAACTCCATA CACGGTATTC TACAATGGCG 2800